APPARATUS AND METHODS FOR MAXIMIZING
SERVICE-LEVEL-AGREEMENT PROFITS 

1. Technical Field: 

The present invention is directed to an improved distributed computer system. More
particularly, the present invention is directed to apparatus and methods for maximizing
service-level-agreement (SLA) profits.

2. Description of Related Art: 

As the exponential growth in Internet usage continues, much of which is fueled by the
growth and requirements of different aspects of electronic business (e-business), there is an
increasing need to provide Quality of Service (QoS) performance guarantees across a wide range
of high-volume commercial Web site environments. A fundamental characteristic of these
commercial environments is the diverse set of services provided to support customer
requirements. Each of these services has a different level of importance to both the service
providers and their clients. To this end, Service Level Agreements (SLAs) are established
between service providers and their clients so that different QoS requirements can be satisfied.
This gives rise to the definition of different classes of services. Once an SLA is in effect, the
service providers must make appropriate resource management decisions to accommodate these
SLA service classes.

One environment in which SLAs are of increasing importance is the Web server
farm. Web server farms are becoming a major means by which Web sites are hosted. The basic
architecture of a Web server farm is a cluster of Web servers that allows various Web sites to
share the resources of the farm, i.e., processor resources, disk storage, communication bandwidth,
and the like. In this way, a Web server farm supplier may host Web sites for a plurality of
different clients.

In managing the resources of the Web server farm, traditional resource management 
mechanisms attempt to optimize conventional performance metrics such as mean response time 
and throughput. However, merely optimizing performance metrics such as mean response time
and throughput does not take into consideration tradeoffs that may be made in view of meeting or
not meeting the SLAs being managed. In other words, merely optimizing performance metrics 
does not provide an indication of the amount of revenue generated or lost due to meeting or not 
meeting the service level agreements. 

Thus, it would be beneficial to have an apparatus and method for managing system 
resources under service level agreements based on revenue metrics rather than strictly using 
conventional performance metrics in order to maximize the amount of profit generated under the 
SLAs.

SUMMARY OF THE INVENTION

The present invention provides apparatus and methods for maximizing
service-level-agreement (SLA) profits. The apparatus and methods formulate SLA
profit maximization as a network flow model with a separable set of concave cost functions at
the servers of a Web server farm. The SLA classes are taken into account in both the
constraints and the cost functions, where the delay constraints are specified as the tails of the
corresponding response-time distributions. This formulation simultaneously yields both optimal
load balancing and server scheduling parameters under two classes of server scheduling policies,
Generalized Processor Sharing (GPS) and Preemptive Priority Scheduling (PPS). For the GPS
case, a pair of optimization problems is iteratively solved in order to find the optimal
parameters that assign traffic to servers and server capacity to classes of requests. For the PPS
case, the optimization problems are iteratively solved for each of the priority classes, and an
optimal priority hierarchy is obtained.

BRIEF DESCRIPTION OF THE DRAWINGS

The novel features believed characteristic of the invention are set forth in the appended 
claims. The invention itself, however, as well as a preferred mode of use, further objectives and 
advantages thereof, will best be understood by reference to the following detailed description of
an illustrative embodiment when read in conjunction with the accompanying drawings, wherein: 

Figure 1 is an exemplary block diagram illustrating a network data processing system 
according to one embodiment of the present invention; 

Figure 2 is an exemplary block diagram illustrating a server device according to one
embodiment of the present invention;

Figure 3 is an exemplary block diagram illustrating a client device according to one
embodiment of the present invention;

Figure 4 is an exemplary diagram of a Web server farm in accordance with the present
invention;

Figure 5 is an exemplary diagram illustrating a Web server farm model according to
the present invention;

Figures 6A and 6B illustrate a queuing network in accordance with the present invention;

Figure 7 is an exemplary diagram of a network flow model in accordance with the
present invention;

Figure 8 is a flowchart outlining an exemplary operation of the present invention in a
GPS scheduling environment; and

Figure 9 is a flowchart outlining an exemplary operation of the present invention in a
PPS scheduling environment.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

As mentioned above, the present invention provides a mechanism by which profits
generated by satisfying SLAs are maximized. The present invention may be implemented in any
distributed computing system, a stand-alone computing system, or any system in which a cost
model is utilized to characterize revenue generation based on service level agreements. Because
the present invention may be implemented in many different computing environments, a brief
discussion of a distributed network, server computing device, client computing device, and the
like, will now be provided with regard to Figures 1-3 in order to provide a context for the
exemplary embodiments to follow. Although a preferred implementation in Web server farms
will be described, those skilled in the art will recognize and appreciate that the present invention
is significantly more general purpose and is not limited to use with Web server farms.

With reference now to the figures, Figure 1 depicts a pictorial representation of a
network of data processing systems in which the present invention may be implemented.
Network data processing system 100 is a network of computers in which the present invention
may be implemented. Network data processing system 100 contains a network 102, which is the
medium used to provide communications links between various devices and computers
connected together within network data processing system 100. Network 102 may include
connections, such as wire, wireless communication links, or fiber optic cables.

In the depicted example, a server 104 is connected to network 102 along with storage unit
106. In addition, clients 108, 110, and 112 also are connected to network 102. These clients
108, 110, and 112 may be, for example, personal computers or network computers. In the
depicted example, server 104 provides data, such as boot files, operating system images, and
applications to clients 108-112. Clients 108, 110, and 112 are clients to server 104. Network
data processing system 100 may include additional servers, clients, and other devices not shown.

In the depicted example, network data processing system 100 is the Internet with network
102 representing a worldwide collection of networks and gateways that use the TCP/IP suite of
protocols to communicate with one another. At the heart of the Internet is a backbone of
high-speed data communication lines between major nodes or host computers, consisting of
thousands of commercial, government, educational and other computer systems that route data
and messages. Of course, network data processing system 100 also may be implemented as a
number of different types of networks, such as, for example, an intranet, a local area network
(LAN), or a wide area network (WAN). Figure 1 is intended as an example, and not as an
architectural limitation for the present invention.

In addition to the above, the distributed data processing system 100 may further include a
Web server farm 125 which may host one or more Web sites 126-129 for one or more Web site
clients, e.g., electronic businesses or the like. For example, the Web server farm 125 may host a
Web site for "Uncle Bob's Fishing Hole" through which customers may order fishing equipment,
a Web site for "Hot Rocks Jewelry" through which customers may purchase jewelry at wholesale
prices, and a Web site for "Wheeled Wonders" where customers may purchase bicycles and
bicycle related items.

A user of a client device, such as client device 108, may log onto a Web site hosted by the
Web server farm 125 by entering the URL associated with the Web site into a Web browser
application on the client device 108. The user of the client device 108 may then navigate the
Web site using his/her Web browser application, selecting items for purchase, providing personal
information for billing purposes, and the like.

With the present invention, the Web site clients, e.g., the electronic businesses, establish
service level agreements with the Web server farm 125 provider regarding various classes of
service to be provided by the Web server farm 125. For example, a service level agreement may
indicate that a browsing client device is to be provided a first level of service, a client device
having an electronic shopping cart with an item therein is provided a second level of service, and
a client device that is engaged in a "check out" transaction is given a third level of service.
Based on this service level agreement, resources of the Web server farm are allocated to the Web
sites of the Web site clients to handle transactions with client devices. The present invention is
directed to managing the allocation of these resources under the service level agreements in order
to maximize the profits obtained under the service level agreements, as will be described in
greater detail hereafter.

Referring to Figure 2, a block diagram of a data processing system that may be
implemented as a server, such as server 104 or a server in the Web server farm 125 in Figure 1,
is depicted in accordance with a preferred embodiment of the present invention. Data processing
system 200 may be a symmetric multiprocessor (SMP) system including a plurality of processors
202 and 204 connected to system bus 206. Alternatively, a single processor system may be
employed. Also connected to system bus 206 is memory controller/cache 208, which provides an
interface to local memory 209. I/O bus bridge 210 is connected to system bus 206 and provides
an interface to I/O bus 212. Memory controller/cache 208 and I/O bus bridge 210 may be
integrated as depicted.

Peripheral component interconnect (PCI) bus bridge 214 connected to I/O bus 212
provides an interface to PCI local bus 216. A number of modems may be connected to PCI bus
216. Typical PCI bus implementations will support four PCI expansion slots or add-in
connectors. Communications links to network computers 108-112 in Figure 1 may be provided
through modem 218 and network adapter 220 connected to PCI local bus 216 through add-in
boards.

Additional PCI bus bridges 222 and 224 provide interfaces for additional PCI buses 226
and 228, from which additional modems or network adapters may be supported. In this manner,
data processing system 200 allows connections to multiple network computers. A
memory-mapped graphics adapter 230 and hard disk 232 may also be connected to I/O bus 212
as depicted, either directly or indirectly.

Those of ordinary skill in the art will appreciate that the hardware depicted in Figure 2
may vary. For example, other peripheral devices, such as optical disk drives and the like, also
may be used in addition to or in place of the hardware depicted. The depicted example is not
meant to imply architectural limitations with respect to the present invention.

The data processing system depicted in Figure 2 may be, for example, an IBM RISC/System
6000 system, a product of International Business Machines Corporation in Armonk, New York,
running the Advanced Interactive Executive (AIX) operating system.
With reference now to Figure 3, a block diagram illustrating a data processing system is
depicted in which the present invention may be implemented. Data processing system 300 is an
example of a client computer. Data processing system 300 employs a peripheral component
interconnect (PCI) local bus architecture. Although the depicted example employs a PCI bus,
other bus architectures such as Accelerated Graphics Port (AGP) and Industry Standard
Architecture (ISA) may be used. Processor 302 and main memory 304 are connected to PCI
local bus 306 through PCI bridge 308. PCI bridge 308 also may include an integrated memory
controller and cache memory for processor 302. Additional connections to PCI local bus 306
may be made through direct component interconnection or through add-in boards. In the
depicted example, local area network (LAN) adapter 310, SCSI host bus adapter 312, and
expansion bus interface 314 are connected to PCI local bus 306 by direct component connection.
In contrast, audio adapter 316, graphics adapter 318, and audio/video adapter 319 are connected
to PCI local bus 306 by add-in boards inserted into expansion slots. Expansion bus interface 314
provides a connection for a keyboard and mouse adapter 320, modem 322, and additional
memory 324. Small computer system interface (SCSI) host bus adapter 312 provides a
connection for hard disk drive 326, tape drive 328, and CD-ROM drive 330. Typical PCI local
bus implementations will support three or four PCI expansion slots or add-in connectors.

An operating system runs on processor 302 and is used to coordinate and provide control
of various components within data processing system 300 in Figure 3. The operating system
may be a commercially available operating system, such as Windows 2000, which is available
from Microsoft Corporation. An object oriented programming system such as Java may run in
conjunction with the operating system and provide calls to the operating system from Java
programs or applications executing on data processing system 300. "Java" is a trademark of Sun
Microsystems, Inc. Instructions for the operating system, the object-oriented programming system,
and applications or programs are located on storage devices, such as hard disk drive 326, and
may be loaded into main memory 304 for execution by processor 302.

Those of ordinary skill in the art will appreciate that the hardware in Figure 3 may vary 
depending on the implementation. Other internal hardware or peripheral devices, such as flash 
ROM (or equivalent nonvolatile memory) or optical disk drives and the like, may be used in 
addition to or in place of the hardware depicted in Figure 3. Also, the processes of the present 
invention may be applied to a multiprocessor data processing system.

As another example, data processing system 300 may be a stand-alone system configured 
to be bootable without relying on some type of network communication interface, whether or not 
data processing system 300 comprises some type of network communication interface. As a 
further example, data processing system 300 may be a Personal Digital Assistant (PDA) device,
which is configured with ROM and/or flash ROM in order to provide non-volatile memory for
storing operating system files and/or user-generated data.

The depicted example in Figure 3 and above-described examples are not meant to imply 
architectural limitations. For example, data processing system 300 also may be a notebook 
computer or hand held computer in addition to taking the form of a PDA. Data processing
system 300 also may be a kiosk or a Web appliance. 

The present invention provides a mechanism by which resources are managed so as to 
maximize the profit generated by satisfying service level agreements. The present invention will 
be described with regard to a Web server farm; however, the invention is not limited to such. As
mentioned above, the present invention may be implemented in a server, client device,
stand-alone computing system, Web server farm, or the like.

With the preferred embodiment of the present invention, as shown in Figure 4, a Web
server farm 400 is represented by a distributed data processing system consisting of M
heterogeneous servers that independently execute K classes of request streams, where each
request is destined for one of N different Web client Web sites. As shown in Figure 4, the Web
server farm 400 includes a request dispatcher 410 coupled to a plurality of servers 420-432. The
request dispatcher 410 receives requests via the network 102 destined for a Web site supported
by the Web server farm 400. The request dispatcher 410 receives these requests, determines an
appropriate server to handle the request, and reroutes the request to the identified server. The
request dispatcher 410 also serves as an interface for outgoing traffic from the Web server farm
400 to the network 102.

Every Web site supported by the Web server farm 400 has one or more classes of requests 
which may or may not have service level agreement (SLA) requirements. The requests of each 
class for each Web site may be served by a subset of the servers 420-432 comprising the Web 
server farm 400. Further, each server 420-432 can serve requests from a subset of the different
class-Web site pairs.

To accommodate any and all restrictions that may exist in the possible assignments of
class-Web site pairs to servers (e.g., technical, business, etc.), these possible assignments are
given via a general mechanism. Specifically, if A(i,j,k) is the indicator function for these
assignments, A(i,j,k) takes on the value 0 or 1, where 1 indicates that class k requests destined for
Web site j can be served by server i and 0 indicates they cannot. Thus, A(i,j,k) simply defines the
set of class-Web site requests that can be served by a given server of the Web server farm.
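
As an illustration of the indicator function, the following minimal Python sketch stores the
allowed (server, Web site, class) triples in a set and evaluates A(i,j,k) by membership. The
names `ELIGIBLE`, `A`, and `eligible_servers`, as well as the sample triples, are hypothetical
and are not part of the described embodiment.

```python
# Hypothetical eligibility data: (server i, Web site j, class k) triples
# that are allowed. Any triple not listed is treated as not allowed.
ELIGIBLE = {
    (1, 1, 1), (1, 1, 2),
    (2, 1, 1), (2, 2, 1),
    (3, 2, 2),
}


def A(i, j, k):
    """Indicator function: 1 if server i may serve class k requests of site j."""
    return 1 if (i, j, k) in ELIGIBLE else 0


def eligible_servers(j, k, servers):
    """Return the servers that may serve class k requests destined for site j."""
    return [i for i in servers if A(i, j, k) == 1]
```

For the sample data, class 1 requests for Web site 1 may be served by servers 1 and 2 only.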

The present invention provides a mechanism for controlling the routing decisions 
between each request and each server eligible to serve such request. More precisely, the present 
invention determines an optimal proportion of traffic of different classes to different Web sites to
be routed to each of the servers. Thus, the present invention determines which requests are 
actually served by which servers in order to maximize profits generated under SLAs. 

Web clients use the resources of the Web server farm 400 through their navigation
behavior on the hosted Web sites. This navigational behavior is characterized by Web sessions
consisting of a sequence of alternating actions. A typical Web client scenario might consist of
several browse requests, possibly followed by an add-to-shopping cart request or buy
transaction request, in an iterative manner. Between requests, there may be client device-based
delays, which can represent user "think times," fixed time intervals generated by a computer
(e.g., Web crawlers or the like), Web browser application delays (e.g., upon requesting embedded
images), and the like. This sequence can be finite or infinite, with the latter case corresponding
to Web crawler activities. For a single session, the requests may belong to different classes and
the think times may be of different types.

The present invention is premised on the concept that revenue may be generated each
time a request is served in a manner that satisfies the corresponding service level agreement.
Likewise, a penalty may be paid each time a request is not served in a manner that satisfies the
corresponding service level agreement. The only exception to this premise is "best efforts"
requirements in service level agreements which, in the present invention, have a flat rate pricing
policy with zero penalty. Thus, the profit generated by hosting a particular Web site on a Web
server farm is obtained by subtracting the penalties from the revenue generated. The present
invention is directed to maximizing this profit by efficiently managing the Web server farm
resources.

Web Server Farm Model

With the present invention, the Web server farm is modeled by a multiclass queuing
network composed of a set of M single-server multiclass queues and a set of N×K×K queues. The
former represents a collection of heterogeneous Web servers and the latter represents the client
device-based delays (or "think times") between the service completion of one request and the
arrival of a subsequent request within a Web session. Figure 5 is an exemplary diagram
illustrating this Web server farm model. For convenience, the servers of the first set, i.e., the Web
servers, are indexed by i, i=1,...,M, and those of the second set (delay servers) are indexed by (j, k,
k'), the Web client sites by j, j=1,...,N, and the request classes by k, k=1,...,K.

For those M single-server multiclass queues representing the Web servers, it is assumed,
for simplicity, that each server can accommodate all classes of requests; however, the invention is
not limited to such an assumption. Rather, the present invention may be implemented in a Web
server farm in which each server may accommodate a different set of classes of requests, for
example. The service requirements of class k requests at server i follow an arbitrary distribution
with mean $1/\mu_{i,k}$. The capacity of server i is denoted by $C_i$.

The present invention may make use of either a Generalized Processor Sharing (GPS) or
Preemptive Priority Scheduling (PPS) policy to control the allocation of resources
across the different service classes on each server. Under GPS, each class of service is assigned
a coefficient, referred to as a GPS assignment, such that the server capacity is shared among the
classes in proportion to their GPS assignments. Under PPS, scheduling is based on relative
priorities, e.g., class 1 requests have the highest priority, class 2 requests have a lower priority than
class 1, and so on.
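
The proportional sharing under GPS can be sketched in a few lines of Python. This is a minimal
illustration rather than the scheduler of the embodiment: the function name
`gps_capacity_shares` and the sample numbers are hypothetical, and the sketch follows the usual
GPS convention that capacity is divided only among classes that currently have backlogged
requests.

```python
def gps_capacity_shares(capacity, assignments, backlogged):
    """Split a server's capacity among the backlogged classes.

    capacity    -- total capacity of the server
    assignments -- dict mapping class k to its GPS assignment
    backlogged  -- set of classes that currently have waiting requests
    """
    total = sum(assignments[k] for k in backlogged)
    if total == 0:
        # Nothing is backlogged, or only zero-weight classes are backlogged.
        return {}
    return {k: capacity * assignments[k] / total for k in backlogged}
```

With assignments 0.5, 0.3, and 0.2 and only classes 1 and 2 backlogged, class 1 receives
0.5/0.8 of the capacity and class 2 receives 0.3/0.8, so the idle share of class 3 is
redistributed in proportion to the remaining assignments.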

In the case of GPS, the GPS assignment to class k on server i is denoted by $\phi_{i,k}$, with the
sum of $\phi_{i,k}$ over the range of k=1 to k=K being 1. Thus, at any time t, the server capacity devoted
to class k requests, if any, is $\phi_{i,k} C_i / \sum_{k' \in K_i(t)} \phi_{i,k'}$, where $K_i(t)$ is the set of classes with backlogged
requests on server i at time t. Requests within each class are executed either in a
First-Come-First-Served (FCFS) manner or in a processor sharing (PS) manner. User transactions from a
client device destined for a Web site j that begin with a class k request arrive to the distributed

Docket No. YOR9200 10031 US 1 




Express Mail No. EL750738760US 



data processing system, i.e., the Web server farm, from an exogenous source with rate $\lambda_k^{(j)}$. Upon
completion of a class k request, the corresponding Web site j user transaction either returns as a
class k' request with probability $p_{k,k'}^{(j)}$ following a delay at a queue having mean $(\delta_{k,k'}^{(j)})^{-1}$ or
completes with probability $1-\sum_{k'=1}^{K} p_{k,k'}^{(j)}$. The matrix $P^{(j)} = [p_{k,k'}^{(j)}]$ is the corresponding request

feedback matrix for Web site j, which is substochastic and has dimension K×K. This transition
probability matrix $P^{(j)}$ defines how each type of Web site j user transaction flows through the
queuing network as a sequence of server requests and client think times. Thus, this matrix may
be used to accurately reflect the inter-request correlations resulting from client-server
interactions. The client device think times can have arbitrary distributions, depending on the
Web site and the current and future classes. These think times are used in the model to capture
the complete range of causes of delays between the requests of a user session, including computer
delays (e.g., Web crawlers and Web browsers) and human delays.
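
To make the role of the feedback matrix concrete, the following Python sketch generates the
sequence of request classes in a single Web session from a substochastic K×K matrix. It is an
illustrative sketch only: the function name `simulate_session`, the 0-based class indices, and
the `max_requests` cap are assumptions introduced here, and think times are omitted.

```python
import random


def simulate_session(first_class, feedback, rng, max_requests=1000):
    """Generate the sequence of request classes in one Web session.

    first_class -- class (0-indexed) of the first request in the session
    feedback    -- K x K substochastic matrix; a row sum below 1 leaves room
                   for the session to end after a request completes
    rng         -- random.Random instance used for the draws
    """
    sequence = [first_class]
    current = first_class
    while len(sequence) < max_requests:
        u = rng.random()
        cumulative = 0.0
        next_class = None
        for k, p in enumerate(feedback[current]):
            cumulative += p
            if u < cumulative:
                next_class = k
                break
        if next_class is None:
            # The draw fell in the leftover probability: the session completes.
            break
        sequence.append(next_class)
        current = next_class
    return sequence
```

A matrix whose entries are all 0 or 1 yields a fixed sequence of classes, while fractional
entries yield sessions of random length.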

$\Lambda_k^{(j)}$ denotes the rate of aggregate arrivals of Web site j, class k requests to the set of
servers of the Web server farm. The rate of aggregate arrivals may be determined based on the
exogenous arrival rates and the transition probabilities as follows:

$$\Lambda_k^{(j)} = \sum_{k'=1}^{K} \Lambda_{k'}^{(j)}\, p_{k',k}^{(j)} + \lambda_k^{(j)}, \qquad j=1,\ldots,N,\; k=1,\ldots,K \tag{1}$$
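
Equation (1) is a linear system in the aggregate rates for each Web site. One simple way to
solve it, sketched below in Python under the assumption that the feedback matrix is
substochastic so that the iteration converges, is repeated substitution. The function name
`aggregate_arrival_rates`, the tolerance, and the iteration cap are hypothetical choices made
for this illustration.

```python
def aggregate_arrival_rates(exogenous, feedback, tol=1e-12, max_iter=10000):
    """Solve Lambda_k = lambda_k + sum over k' of Lambda_k' * p[k'][k] for one site.

    exogenous -- list of exogenous arrival rates, one per class
    feedback  -- K x K substochastic matrix; feedback[a][b] is the probability
                 that a completed class a request returns as a class b request
    """
    K = len(exogenous)
    rates = list(exogenous)
    for _ in range(max_iter):
        new = [exogenous[k] + sum(rates[a] * feedback[a][k] for a in range(K))
               for k in range(K)]
        if max(abs(new[k] - rates[k]) for k in range(K)) < tol:
            return new
        rates = new
    raise RuntimeError("iteration did not converge; is the matrix substochastic?")
```

For example, with exogenous rates [1.0, 0.0] and feedback rows [0.5, 0.25] and [0.0, 0.0],
the aggregate rate of the first class is 2.0 and that of the second class is 0.5.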

While the above model uses a Markovian description of user navigational behavior, the
present invention is not limited to such. Furthermore, by increasing the number of classes and
thus the dimensions of the transition probability matrix, any arbitrary navigational behavior with
particular sequences of request classes may be modeled. In such cases, many of the entries of the
transition probability matrix $P^{(j)}$ will be 0 or 1. In so doing, any arbitrary distribution of the
number of requests within a session may be approximated.

Cost Model 

As mentioned above, the present invention is directed to maximizing the profit generated
by hosting a Web site on a Web server farm. Thus, a cost model is utilized to represent the costs
involved in hosting the Web site. In this cost model, $\lambda_{i,k}^{(j)}$ is used to denote the rate of class k
requests destined for Web site j that are assigned to server i by the control policy of the present
invention. The scheduling discipline, either GPS or PPS, at each single-server multiclass queue
determines the execution ordering across the request classes.

The cost model is based on the premise that profit is gained for each request that is
processed in accordance with its per-class service level agreement. A penalty is paid for each
request that is not processed in accordance with its per-class service level agreement. More
precisely, assume $T_k$ is a generic random variable for the class k response time process, across all
servers and all Web sites. Associated with each request class k is an SLA of the form:

$$P[T_k > z_k] \le \alpha_k \tag{2}$$

where $z_k$ is a delay constraint and $\alpha_k$ is a tail distribution objective. In other words, the class k
SLA requires that the response times of requests of class k across all Web sites must be less than
or equal to $z_k$ at least $(1-\alpha_k) \times 100$ percent of the time in order to avoid SLA violation penalties.
Thus, the cost model is based on earning a profit $P_k^{+}$ for each class k request having a response
time of at most $z_k$ (i.e., satisfying the tail distribution objective) and incurring a penalty $P_k^{-}$ for
each class k request which has a response time exceeding $z_k$ (i.e., failing the tail distribution
objective).
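
The per-request profit and penalty can be turned into an expected profit per unit time for a
class once the probability of exceeding the delay constraint is known. The Python sketch below
is a minimal illustration; the function names `class_profit_rate` and `sla_satisfied` and the
sample figures are hypothetical.

```python
def class_profit_rate(rate, violation_prob, gain, penalty):
    """Expected profit per unit time for one SLA class.

    rate           -- request rate of the class (requests per unit time)
    violation_prob -- probability that a response time exceeds the delay constraint
    gain           -- amount earned per request served within the delay constraint
    penalty        -- amount paid per request that exceeds the delay constraint
    """
    met = rate * (1.0 - violation_prob)
    missed = rate * violation_prob
    return gain * met - penalty * missed


def sla_satisfied(violation_prob, alpha):
    """SLA of form (2): the tail probability must not exceed the objective alpha."""
    return violation_prob <= alpha
```

For a class with 100 requests per unit time, a gain of 1, a penalty of 3, and a 5 percent
violation probability, the expected profit rate is 1×95 − 3×5 = 80. Algebraically the same
quantity can be written as gain×rate − (gain + penalty)×rate×violation_prob.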

One request class is assumed to not have an SLA and is instead served on a best effort
basis. The cost model for each best effort class k is based on the assumption that a fixed profit
$P_k^{+}$ is gained for the entire class, independent of the number of class k requests executed. For
simplicity, it will be assumed that there is only one best effort class, namely class K; however, it
is straightforward to extend the present invention to any number of best effort classes.

SLA Profit Maximization: GPS Case

Resource management with the goal of maximizing the profit gained in hosting Web sites
under SLA constraints will now be considered. As previously mentioned, the foregoing Web
server farm model and cost model will be considered under two different local scheduling
policies for allocating server capacity among classes of requests assigned to each server. These
two policies are GPS and PPS, as previously described. The GPS policy will be described first.

In the GPS policy case, the aim is to find the optimal traffic assignments $\lambda_{i,k}^{(j)}$ and GPS
assignments $\phi_{i,k}$ that maximize the profit, given the class-Web site assignments A(i,j,k) and the
exogenous arrival rates $\lambda_k^{(j)}$, which yield the aggregate arrival rates $\Lambda_k^{(j)}$ through equation (1).
Here $\lambda_{i,k}^{(j)}$ denotes the rate of class k, Web site j requests assigned to server i and $\phi_{i,k}$ denotes the
GPS assignment for class k at server i.

Routing decisions at the request dispatcher 410 are considered to be random, i.e., a class k
request for site j is routed to server i with probability $\lambda_{i,k}^{(j)} / \sum_{i'=1}^{M} \lambda_{i',k}^{(j)}$, independent of the past and
future routing decisions. When other routing mechanisms are used, such as weighted round
robin, the resource management solutions of the present invention may be used for setting the
parameters of these mechanisms (e.g., the weights of the weighted round robin), thus yielding
suboptimal solutions.
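
The random routing rule can be sketched in Python as follows. The sketch assumes that the
traffic assignments for one class-Web site pair are already available as a mapping from server
to assigned rate; the function names `routing_probabilities` and `route_request` and the sample
rates are hypothetical.

```python
import random


def routing_probabilities(assigned_rates):
    """Turn per-server assigned rates for one (site, class) pair into probabilities.

    assigned_rates -- dict mapping server i to the rate assigned to it
    """
    total = sum(assigned_rates.values())
    if total <= 0:
        raise ValueError("no traffic is assigned for this site/class pair")
    return {i: r / total for i, r in assigned_rates.items()}


def route_request(assigned_rates, rng=random):
    """Pick a server at random, independently of earlier routing decisions."""
    probs = routing_probabilities(assigned_rates)
    servers = list(probs)
    weights = [probs[i] for i in servers]
    return rng.choices(servers, weights=weights, k=1)[0]
```

The same normalized values could serve as the weights of a weighted round robin dispatcher
when random routing is not used.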

With the present invention, the queuing network model described above is first
decomposed into separate queuing systems. Then, the optimization problem is formulated as the
sum of costs of these queuing systems. Finally, the optimization problem is solved. Thus, by
summing the profits and penalties of each queuing system and then summing the profits and
penalties over all of the queuing systems for a particular class k request to a Web site j, a cost
function may be generated for maintaining the Web site on the Web server farm. By maximizing
the profit in this cost function, resource management may be controlled in such a way as to
maximize the profit of maintaining the Web site under the service level agreement.

In formulating the optimization problem as a sum of the costs of the individual queuing
systems, only servers 1,...,M need be considered and these queuing systems may be considered to
have arrivals of rate $\lambda_{i,k} = \sum_{j=1}^{N} \lambda_{i,k}^{(j)}$ for each of the classes on each of the servers
i=1,...,M. The corresponding queuing network is illustrated in Figures 6A and 6B.

These queuing systems are analyzed by deriving tail distributions of sojourn times in
these queues in view of the SLA constraints. Bounding techniques are utilized to decompose the
multiclass queue associated with server i into multiple single-server queues with capacity $\phi_{i,k} C_i$.
The resulting performance characteristics (i.e., sojourn times) are upper bounds on those in the
original systems.

For simplicity of the analysis and tractability of the optimization, the GPS assignments
are assumed to satisfy $\lambda_{i,k} < \mu_{i,k}\phi_{i,k}C_i$. It then follows from standard queuing theory that the
left-hand side of equation (2) can be bounded by

$$P[T_k > z_k] \le \exp\!\big(-(\mu_{i,k}\phi_{i,k}C_i - \lambda_{i,k})\,z_k\big) \tag{3}$$

Hence, the SLA constraint is satisfied when

$$P[T_k > z_k] \le \exp\!\big(-(\mu_{i,k}\phi_{i,k}C_i - \lambda_{i,k})\,z_k\big) \le \alpha_k, \qquad k=1,\ldots,K-1 \tag{4}$$
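
The exponential bound and the resulting admissible rate can be checked numerically. The Python
sketch below assumes the bound has the form exp(−(μφC − λ)z), with μ the per-class service
rate, φ the GPS assignment, C the server capacity, and λ the assigned rate; the function names
`tail_bound` and `max_sla_rate` and the sample values are hypothetical.

```python
import math


def tail_bound(mu, phi, capacity, lam, z):
    """Upper bound exp(-(mu*phi*C - lam) * z) on the probability of exceeding z."""
    slack = mu * phi * capacity - lam
    if slack <= 0:
        raise ValueError("requires lam < mu * phi * capacity")
    return math.exp(-slack * z)


def max_sla_rate(mu, phi, capacity, z, alpha):
    """Largest rate for which the bound stays at or below alpha.

    Obtained by solving exp(-(mu*phi*C - lam) * z) <= alpha for lam.
    """
    return mu * phi * capacity + math.log(alpha) / z
```

With μ = 10, φ = 0.5, C = 2, z = 1, and α = 0.05, the effective service rate is 10 and the
largest admissible rate is 10 + ln(0.05), roughly 7.0. A non-positive result would mean that
no traffic of the class can be placed on the server under this assignment.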

Next, the optimization problem is divided into separate formulations for the SLA-based 
classes and the best effort class. As a result of equation (4), the formulation of the SLA classes is 
given by: 

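$$\max \sum_{i=1}^{M}\sum_{k=1}^{K-1} P_k^{+}\lambda_{i,k} - (P_k^{+}+P_k^{-})\,\lambda_{i,k}\exp\left(-(\mu_{i,k}\phi_{i,k}C_i-\lambda_{i,k})\,z_k\right) \tag{5}$$

s.t.

$$\lambda_{i,k} \le \frac{\ln(\alpha_k\zeta_k)}{z_k} + \mu_{i,k}\phi_{i,k}C_i, \quad i=1,\ldots,M,\ k=1,\ldots,K-1;$$

$$\sum_{j=1}^{N}\lambda_{i,k}^{(j)} = \lambda_{i,k}, \quad i=1,\ldots,M,\ k=1,\ldots,K-1;$$

$$\sum_{i=1}^{M}\lambda_{i,k}^{(j)} = \Lambda_k^{(j)}, \quad j=1,\ldots,N,\ k=1,\ldots,K-1;$$

$$\lambda_{i,k}^{(j)} = 0 \ \text{if } A(i,j,k)=0, \quad k=1,\ldots,K-1,\ \forall i,j;$$

$$\lambda_{i,k}^{(j)} \ge 0 \ \text{if } A(i,j,k)=1, \quad k=1,\ldots,K-1,\ \forall i,j;$$

$$\sum_{k=1}^{K-1}\phi_{i,k} \le 1, \quad i=1,\ldots,M,$$

where $\zeta_k$ is a scaling factor for the SLA constraint $\alpha_k$, and $C_i$ is the capacity of server i. Here, the
$\lambda_{i,k}^{(j)}$ and $\phi_{i,k}$ are the decision variables sought and $P_k^{+}$, $P_k^{-}$, $C_i$, $\Lambda_k^{(j)}$, $z_k$, $\alpha_k$, $\zeta_k$ and $\mu_{i,k}$ are input
parameters. By allowing $\zeta_k$ to go to infinity for any class k, the cost model makes it possible to
include the objective of maximizing the throughput for class k.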
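To make the structure of equation (5) concrete, the following sketch evaluates the objective for given per-server arrival rates and GPS shares, and computes the largest arrival rate permitted by the first constraint. It is only an illustration under assumed inputs: the nested-list layout and the names `sla_profit` and `max_rate` are hypothetical.

```python
import math


def sla_profit(lam, phi, mu, cap, z, p_plus, p_minus):
    """Objective of equation (5).

    lam[i][k], phi[i][k], mu[i][k] are indexed by server i and SLA class k;
    cap[i] is the server capacity; z[k], p_plus[k], p_minus[k] are per class.
    """
    total = 0.0
    for i in range(len(cap)):
        for k in range(len(z)):
            # Bound on the probability that a class k request misses its SLA.
            tail = math.exp(-(mu[i][k] * phi[i][k] * cap[i] - lam[i][k]) * z[k])
            total += p_plus[k] * lam[i][k] - (p_plus[k] + p_minus[k]) * lam[i][k] * tail
    return total


def max_rate(mu, phi, cap, z, alpha, zeta=1.0):
    """Largest rate allowed by the constraint lam <= ln(alpha*zeta)/z + mu*phi*cap."""
    return math.log(alpha * zeta) / z + mu * phi * cap
```

With `zeta=1.0` the rate returned by `max_rate` makes the tail bound equal to `alpha` exactly; a larger `zeta` admits more traffic at the cost of a weaker guarantee.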

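The formulation for the optimal control problem for the best effort class attempts to
minimize the weighted sum of the expected response time for class K requests over all servers,
and yields:

$$\min \sum_{i=1}^{M} n_{i,K}\left(\frac{\sum_{k=1}^{K}\lambda_{i,k}\,b_{i,k}^{(2)}}{2\,(1-\rho_{i,K-1}^{+})(1-\rho_{i,K}^{+})} + \frac{b_{i,K}}{1-\rho_{i,K-1}^{+}}\right) \tag{6}$$

s.t.

$$\lambda_{i,K} \le \mu_{i,K}\,\bar{C}_i, \quad i=1,\ldots,M;$$

$$\bar{C}_i = C_i - \sum_{k=1}^{K-1}\frac{\lambda_{i,k}}{\mu_{i,k}}, \quad i=1,\ldots,M;$$

$$\sum_{i=1}^{M}\lambda_{i,K}^{(j)} = \Lambda_K^{(j)}, \quad j=1,\ldots,N;$$

$$\lambda_{i,K}^{(j)} = 0 \ \text{if } A(i,j,K)=0, \quad i=1,\ldots,M,\ j=1,\ldots,N;$$

$$\lambda_{i,K}^{(j)} \ge 0 \ \text{if } A(i,j,K)=1, \quad i=1,\ldots,M,\ j=1,\ldots,N,$$

where $n_{i,K}$ is the weighting factor for the expected response time of the best effort class K on
server i, $b_{i,k}$ and $b_{i,k}^{(2)}$ are the first two moments of the service times ($b_{i,k} = 1/\mu_{i,k}$), and $\rho_{i,k}^{+}$ is the total
load of classes 1,...,k:

$$\rho_{i,k}^{+} = \sum_{k'=1}^{k}\rho_{i,k'} \equiv \sum_{k'=1}^{k}\lambda_{i,k'}\,b_{i,k'}/C_i \tag{7}$$

Here, $\lambda_{i,K}^{(j)}$ are the decision variables that are sought and the remaining variables are input
parameters.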
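The per-server term of equation (6) can be evaluated directly from the service time moments. The sketch below assumes one server with classes listed in priority order (index 0 is class 1, the last index is the best effort class K); the function names are hypothetical.

```python
def cumulative_load(lam, b1, capacity, k):
    """Total load of classes 1..k on one server, following equation (7)."""
    return sum(lam[c] * b1[c] for c in range(k)) / capacity


def best_effort_response_time(lam, b1, b2, capacity):
    """Expected response time of the lowest priority class, the bracketed term of (6).

    lam[c], b1[c], b2[c] are the arrival rate and the first two service time
    moments of class c+1 on this server.
    """
    num_classes = len(lam)
    rho_prev = cumulative_load(lam, b1, capacity, num_classes - 1)
    rho_all = cumulative_load(lam, b1, capacity, num_classes)
    if rho_all >= 1.0:
        raise ValueError("total load must be below 1 for a stable queue")
    residual = sum(rate * m2 for rate, m2 in zip(lam, b2))
    return residual / (2.0 * (1.0 - rho_prev) * (1.0 - rho_all)) + b1[-1] / (1.0 - rho_prev)
```

As a sanity check, with a single class, exponential service of mean 1 (second moment 2), arrival rate 0.5 and unit capacity, the function returns 2, which is the familiar M/M/1 value 1/(1 - 0.5).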

The expression of the response time in the above cost function comes from the queuing
results on preemptive M/G/1 queues, which are described in, for example, H. Takagi, Queueing
Analysis, vol. 1, North-Holland, Amsterdam, 1991, pages 343-347, which is hereby incorporated
by reference. The use of these queuing results is valid since the SLA classes are assigned the
total capacity. The GPS assignment for best effort class requests is 0, which results in a priority
scheme between SLA classes and the best effort class. Furthermore, owing to the product-form
solution, a Poisson model may be used for higher priority class requests.

The weights $n_{i,K}$ are included in the formulation as they may be of greater use when there are
multiple best effort classes, e.g., classes K through K'. As a simple example, the weights $n_{i,K}$
may be set to . .+/W + ) in this case.

In the above formulation of equation (5), the scaling factors $\zeta_k > 1$ are used to generalize
the optimization problem. Several practical considerations motivate the use of such scaling
factors. Observe first that the use of most optimization algorithms requires that the cost
functions be explicit and exhibit certain properties, e.g., differentiability and
convexity/concavity. However, for scheduling policies like GPS, the resulting queuing model is
usually very difficult to analyze and bounding or approximation techniques have to be used in
order to obtain tail distributions. Such an approach results in a bound for the GPS scheduling
policy. Thus, the use of the scaling factors allows this bias to be corrected.

Secondly, queuing models are only mathematical abstractions of the real system. Users
of such queuing models usually have to be pessimistic in the setting of model parameters. Once
again, the scaling factors $\zeta_k > 1$ may be useful to bridge the gap between the queuing theoretic
analysis and the real system. Furthermore, the scaling factors $\zeta_k > 1$ make it possible for the
hosting company to violate the SLA to a controlled degree in an attempt to increase profits under
equation (5), whereas the hosting company will strictly follow the predefined SLA whenever
$\zeta_k = 1$.

There are two sets of decision variables in the formulation of the optimal control problem
shown in equation (5), namely, $\lambda_{i,k}^{(j)}$ and $\phi_{i,k}$, where the latter variables control the local GPS
policy at each server. To address this problem, two subproblems are considered in an iterative
manner using the same formulation with appropriately modified constraints to solve for the
decision variables $\lambda_{i,k}^{(j)}$ and $\phi_{i,k}$. Specifically, the following equations are iteratively solved to
solve for the decision variables:

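$$\max \sum_{i=1}^{M}\sum_{k=1}^{K-1} P_k^{+}\lambda_{i,k} - (P_k^{+}+P_k^{-})\,\lambda_{i,k}\exp\left(-(\mu_{i,k}\phi_{i,k}C_i-\lambda_{i,k})\,z_k\right) \tag{8}$$

s.t.

$$\lambda_{i,k} \le \frac{\ln(\alpha_k\zeta_k)}{z_k} + \mu_{i,k}\phi_{i,k}C_i, \quad i=1,\ldots,M,\ k=1,\ldots,K-1;$$

$$\sum_{j=1}^{N}\lambda_{i,k}^{(j)} = \lambda_{i,k}, \quad i=1,\ldots,M,\ k=1,\ldots,K-1;$$

$$\sum_{i=1}^{M}\lambda_{i,k}^{(j)} = \Lambda_k^{(j)}, \quad j=1,\ldots,N,\ k=1,\ldots,K-1;$$

$$\lambda_{i,k}^{(j)} = 0 \ \text{if } A(i,j,k)=0, \quad k=1,\ldots,K-1,\ \forall i,j;$$

$$\lambda_{i,k}^{(j)} \ge 0 \ \text{if } A(i,j,k)=1, \quad k=1,\ldots,K-1,\ \forall i,j.$$

$$\max \sum_{i=1}^{M}\sum_{k=1}^{K-1} P_k^{+}\lambda_{i,k} - (P_k^{+}+P_k^{-})\,\lambda_{i,k}\exp\left(-(\mu_{i,k}\phi_{i,k}C_i-\lambda_{i,k})\,z_k\right) \tag{9}$$

s.t.

$$\phi_{i,k} \ge \frac{\lambda_{i,k}}{\mu_{i,k}C_i} - \frac{\ln(\alpha_k\zeta_k)}{z_k\,\mu_{i,k}C_i}, \quad i=1,\ldots,M,\ k=1,\ldots,K-1;$$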

Equation (8) is an example of a network flow resource allocation problem. Both equations (8)
and (9) can be solved by decomposing the problem into M separable, concave resource allocation
problems, one for each class in equation (8) and one for each server in equation (9). The
optimization problem (8) has additional constraints corresponding to the site to server
assignments. The two optimization problems shown in equations (8) and (9) then form the basis
for a fixed-point iteration. In particular, initial values are chosen for the variables $\phi_{i,k}$ and
equation (8) is solved using the algorithms described hereafter to obtain the optimal control
variables $\lambda_{i,k}^{(j)*}$. This set of optimal control variables $\lambda_{i,k}^{(j)*}$ is then substituted into equation (9)
and the optimal control variables $\phi_{i,k}^{*}$ are obtained. This iterative procedure continues until a
difference between the sets of control variables of an iteration and those of the previous iteration
is below a predetermined threshold. The optimization problems are defined more precisely as
follows.
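The alternation between equations (8) and (9) can be sketched as a generic fixed-point driver. This is only an outline under stated assumptions: `solve_rates` and `solve_shares` are hypothetical callables standing in for solvers of equations (8) and (9), and the variables are flattened into plain lists of floats.

```python
def fixed_point_allocation(solve_rates, solve_shares, phi0, tol=1e-8, max_iter=1000):
    """Alternate between the two subproblems until both variable sets stop changing.

    solve_rates(phi) returns arrival rates for fixed GPS shares (stand-in for (8)).
    solve_shares(lam) returns GPS shares for fixed arrival rates (stand-in for (9)).
    """
    phi = list(phi0)
    lam = None
    for _ in range(max_iter):
        new_lam = solve_rates(phi)
        new_phi = solve_shares(new_lam)
        if lam is not None:
            # Largest change in any control variable since the previous iteration.
            change = max(
                max(abs(a - b) for a, b in zip(new_lam, lam)),
                max(abs(a - b) for a, b in zip(new_phi, phi)),
            )
            if change < tol:
                return new_lam, new_phi
        lam, phi = new_lam, new_phi
    raise RuntimeError("fixed-point iteration did not converge")
```

The first pass never reports convergence because there is no previous iterate to compare against, which mirrors the role of the threshold test described above.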

Optimization Algorithms 

There are, in fact, two related resource allocation problems, one a generalization of the
other. Solutions to both of these problems are required to complete the analysis. Furthermore,
the solution to the special problem is employed in the solution of the general problem, and thus,
both will be described.

The more general problem pertains to a directed network with a single source node and
multiple sink nodes. There is a function associated with each sink node. This function is
required to be increasing, differentiable, and concave in the net flow into the sink, and the overall
objective function is the (separable) sum of these concave functions. The goal is to maximize
this objective function. There can be both upper and lower bound constraints on the flows on
each directed arc. In the continuous case, the decision variables are real numbers. However, for
the discrete case, other versions of this algorithm may be utilized. The problem thus formulated
is a network flow resource allocation problem that can be solved quickly due to the resulting
constraints being submodular.
Consider a directed network consisting of nodes V and directed arcs A. The arcs $a_{v_1v_2} \in A$
carry flow $f_{v_1v_2}$ from nodes $v_1 \in V$ to nodes $v_2 \in V$. The flow is a real variable which is constrained
to be bounded below by a constant $l_{v_1v_2}$ and above by a constant $u_{v_1v_2}$. That is,

$$l_{v_1v_2} \le f_{v_1v_2} \le u_{v_1v_2} \tag{10}$$

for each arc $a_{v_1v_2}$. It is possible, of course, that $l_{v_1v_2} = 0$ and $u_{v_1v_2} = \infty$. There will be a single source
node $s \in V$ satisfying $\sum_{a_{sv_2}} f_{sv_2} - \sum_{a_{v_1s}} f_{v_1s} = R > 0$. This value R, the net outflow from the source, is a
constant that represents the amount of resource available to be allocated. There are N sinks
$v_2 \in \mathcal{N} \subseteq V$ which have the property that their net inflow $\sum_{a_{v_1v_2}} f_{v_1v_2} - \sum_{a_{v_2v_3}} f_{v_2v_3} > 0$. All other nodes
$v_2 \in V - \{s\} - \mathcal{N}$ are transshipment nodes that satisfy $\sum_{a_{v_1v_2}} f_{v_1v_2} - \sum_{a_{v_2v_3}} f_{v_2v_3} = 0$. There is a single
increasing, concave, differentiable function $F_{v_2}$ for the net flow into each sink node $v_2$. So the
overall objective function is

$$\sum_{v_2 \in \mathcal{N}} F_{v_2}\Big(\sum_{a_{v_1v_2}} f_{v_1v_2} - \sum_{a_{v_2v_3}} f_{v_2v_3}\Big) \tag{11}$$

which is sought to be maximized subject to the lower and upper bound constraints described in
equation (10).



Docket No. YOR920010031US1 



Express Mail No. EL750738760US 



A special case of this problem is to maximize the sum

$$\sum_{v_2=1}^{N} F_{v_2}(x_{v_2}) \tag{12}$$

of a separable set of N increasing, concave, differentiable functions subject to bound constraints

$$l_{v_2} \le x_{v_2} \le u_{v_2} \tag{13}$$

and subject to the resource constraint

$$\sum_{v_2=1}^{N} x_{v_2} = R \tag{14}$$

for real decision variables $x_{v_2}$. In this so-called separable concave resource allocation problem,
the optimal solution occurs at the place where the derivatives $F'_{v_2}(x_{v_2})$ are equal and equation (14)
holds, modulo the bound constraints in equation (13).

More precisely, the algorithm proceeds as follows: If either $\sum_{v_2=1}^{N} l_{v_2} > R$ or $\sum_{v_2=1}^{N} u_{v_2} < R$,
there is no feasible solution and the algorithm terminates. Otherwise, the algorithm consists of
an outer bisection loop that determines the value of the derivative D and a set of N inner
bisection loops that find the value of $l_{v_2} \le x_{v_2} \le u_{v_2}$ satisfying $F'_{v_2}(x_{v_2}) = D$ if $F'_{v_2}(l_{v_2}) \ge D$ and
$F'_{v_2}(u_{v_2}) \le D$. Otherwise, since the derivative of a concave function is decreasing, $x_{v_2}$ is set to $l_{v_2}$
(when $F'_{v_2}(l_{v_2}) < D$), or $x_{v_2}$ is set to $u_{v_2}$ (when $F'_{v_2}(u_{v_2}) > D$).
The initial values for the outer loop can be taken as the minimum of all values $F'_{v_2}(u_{v_2})$ and the
maximum of all values $F'_{v_2}(l_{v_2})$. The initial values for the $v_2$-th inner loop can be taken to be $l_{v_2}$
and $u_{v_2}$.
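The nested bisection just described can be sketched as follows. This is an illustrative implementation under assumptions: the derivatives are supplied as Python callables, each function is strictly concave so that its derivative is strictly decreasing, and a fixed number of bisection steps replaces an explicit tolerance. The name `solve_separable_concave` is hypothetical.

```python
def solve_separable_concave(derivs, lower, upper, resource, iters=100):
    """Maximize sum_j F_j(x_j) s.t. lower[j] <= x_j <= upper[j], sum_j x_j = resource.

    derivs[j] is the derivative F_j', assumed strictly decreasing.
    Returns the allocation as a list, or None when no feasible solution exists.
    """
    n = len(derivs)
    if sum(lower) > resource or sum(upper) < resource:
        return None

    def allocation(level, j):
        """Inner loop: x_j with F_j'(x_j) = level, clipped to the bounds."""
        d, lo, hi = derivs[j], lower[j], upper[j]
        if d(lo) <= level:
            return lo
        if d(hi) >= level:
            return hi
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if d(mid) > level:
                lo = mid  # derivative still too large: move right
            else:
                hi = mid
        return 0.5 * (lo + hi)

    # At d_lo every variable sits at its upper bound; at d_hi at its lower bound.
    d_lo = min(derivs[j](upper[j]) for j in range(n))
    d_hi = max(derivs[j](lower[j]) for j in range(n))
    for _ in range(iters):  # outer loop on the common derivative value D
        level = 0.5 * (d_lo + d_hi)
        total = sum(allocation(level, j) for j in range(n))
        if total > resource:
            d_lo = level  # too much allocated: raise D
        else:
            d_hi = level
    level = 0.5 * (d_lo + d_hi)
    return [allocation(level, j) for j in range(n)]
```

For example, with $F_j(x) = w_j \log(1+x)$, weights 1, 2, 3, bounds [0, 10] and R = 6, the derivatives equalize at D = 2/3 and the allocation is approximately (0.5, 2.0, 3.5).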

The more general problem is itself a special case of the so-called submodular constraint
resource allocation problem. It is solved by recursive calls to a subroutine that solves the
problem with a slightly revised network and with generalized bound constraints

$$l'_{v_1v_2} \le f_{v_1v_2} \le u'_{v_1v_2} \tag{15}$$

instead of those in equation (10). As the algorithm proceeds it makes calls to the separable
concave resource allocation problem solver. More precisely, the separable concave resource
allocation problem obtained by ignoring all but the source and sink nodes is solved first. Let $x_{v_2}$
denote the solution to that optimization problem.

In the next step, a supersink t is added to the original network, with directed arcs jt from
each original sink, forming a revised network (V',A'). $l'_{jt}$ is set to 0 and $u'_{jt}$ is set to $x_{v_2}$ for all
arcs connecting the original sinks to the supersink. For all other arcs, the lower and upper
bounds remain the same. Thus, $l'_{v_1v_2} = l_{v_1v_2}$ and $u'_{v_1v_2} = u_{v_1v_2}$ for all arcs $a_{v_1v_2}$. The so-called
maximum flow problem is then solved to find the largest possible flow $f_{v_1v_2}$ through the network
(V',A') subject to constraints in equation (10). A simple routine for the maximum flow problem
is the labeling algorithm combined with a path augmentation routine. Using the residual network
one can simultaneously obtain the minimum cut partition. For definitions of these terms, please
see Ahuja, Magnanti, & Orlin, Network Flows, Prentice Hall, Englewood Cliffs, NJ, 1993, pages
44-46, 70, 163 and 185, which is hereby incorporated by reference. Those original sink nodes j
which appear in the same partition as the supersink are now regarded as saturated. The flow $f_{v_2t}$
becomes the lower and upper bounds on that arc. Thus, $l'_{v_2t}$ is set to $u'_{v_2t}$, which is equal to $f_{v_2t}$.
For all remaining unsaturated arcs j, $l'_{v_2t}$ is set to $x_{v_2}$ and $u'_{v_2t}$ is set equal to $f_{v_2t}$. Now the process
is repeated, solving the separable concave resource allocation problem for the unsaturated nodes
only, with suitably revised total resource, and then solving the revised network flow problem.
This process continues until all nodes are saturated, or an infeasible solution is reached.
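The maximum flow step and the minimum cut partition can be illustrated with a breadth-first labeling routine and path augmentation. This is a generic textbook sketch, not the specific routine of the described embodiment; it assumes all lower bounds are zero, and the name `max_flow_min_cut` and the dictionary layout are hypothetical.

```python
from collections import deque


def max_flow_min_cut(capacity, source, sink):
    """Labeling algorithm with path augmentation.

    capacity maps arcs (u, v) to upper bounds; lower bounds are assumed to be 0.
    Returns (flow value, set of nodes on the source side of a minimum cut).
    """
    residual, adj = {}, {}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0.0) + c
        residual.setdefault((v, u), 0.0)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def label():
        """Label every node reachable from the source in the residual network."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        return parent

    value = 0.0
    while True:
        parent = label()
        if sink not in parent:
            # No augmenting path: the labeled nodes form the source side of the cut.
            return value, set(parent)
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        value += bottleneck
```

In the procedure described above, the sinks that do not appear in the returned source-side set lie in the same partition as the supersink and would be the ones treated as saturated.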

An example of the network flow model is provided in Figure 7. In addition to the source
node s, there are NK nodes corresponding to the sites and classes, followed by two pairs of MK
nodes corresponding to the servers and classes, and a supersink t. In the example, M=N=K=3. In
the first group of arcs, the (j,k)th node has capacity equal to $\Lambda_k^{(j)}$. The second group of arcs
corresponds to the assignment matrix A(i,j,k), and these arcs have infinite capacity. The
capacities of the third group of arcs on (i,k) correspond to the SLA constraints. The duplication
of the nodes here handles the fact that the constraint is really on the server and class nodes. The
final group of arcs connects these nodes to the supersink t.
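The layered network of Figure 7 can be assembled programmatically. The sketch below is an assumption-laden illustration: the node labels, the nested-list inputs, and the choice of unbounded capacity on the final group of arcs (which the text does not specify) are all hypothetical.

```python
def build_flow_network(total_rate, assignment, sla_cap):
    """Build the arc capacities of a Figure 7 style network.

    total_rate[j][k] is the total class k rate for site j (capacity of group 1).
    assignment[i][j][k] is 1 when site j, class k may be served by server i.
    sla_cap[i][k] is the SLA-derived bound for class k on server i (group 3).
    """
    inf = float("inf")
    num_sites, num_classes = len(total_rate), len(total_rate[0])
    num_servers = len(sla_cap)
    arcs = {}
    for j in range(num_sites):
        for k in range(num_classes):
            arcs[("s", ("site", j, k))] = total_rate[j][k]          # group 1
            for i in range(num_servers):
                if assignment[i][j][k]:
                    arcs[(("site", j, k), ("in", i, k))] = inf      # group 2
    for i in range(num_servers):
        for k in range(num_classes):
            arcs[(("in", i, k), ("out", i, k))] = sla_cap[i][k]     # group 3
            arcs[(("out", i, k), "t")] = inf                        # group 4 (assumed unbounded)
    return arcs
```

With M=N=K=3 and every assignment allowed, this yields 1 + 9 + 18 + 1 = 29 nodes, matching the node count described for the example.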

SLA Profit Maximization: PPS Case 

In formulating the SLA based optimization problem under the PPS discipline for 
allocating server capacity among the classes of requests assigned to each server, the approach is 
again to decompose the model to isolate the per-class queues at each server. However, in the 
PPS case, the decomposition of the per-class performance characteristics for each server i is 
performed in a hierarchical manner such that the analysis of the decomposed model for each 
class k in isolation is based on the solution for the decomposed models of classes 1,...,k-1.

Assuming that the lower priority classes do not interfere with the processing of class 1 
requests, as is the case under PPS, then the product-form results derived above indicate that the 
arrival process to the class 1 queue is a Poisson process. Hence, equation (5) still holds for class 
1 requests which then leads to the following formulation for the class 1 optimal control problem: 

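$$\max \sum_{i=1}^{M} P_1^{+}\lambda_{i,1} - (P_1^{+}+P_1^{-})\,\lambda_{i,1}\exp\left(-(\mu_{i,1}C_i-\lambda_{i,1})\,z_1\right) \tag{16}$$

s.t.

$$\lambda_{i,1} \le \frac{\ln(\alpha_1\zeta_1)}{z_1} + \mu_{i,1}C_i, \quad i=1,\ldots,M;$$

$$\sum_{j=1}^{N}\lambda_{i,1}^{(j)} = \lambda_{i,1}, \quad i=1,\ldots,M;$$

$$\sum_{i=1}^{M}\lambda_{i,1}^{(j)} = \Lambda_1^{(j)}, \quad j=1,\ldots,N;$$

$$\lambda_{i,1}^{(j)} = 0 \ \text{if } A(i,j,1)=0, \quad i=1,\ldots,M,\ j=1,\ldots,N;$$

$$\lambda_{i,1}^{(j)} \ge 0 \ \text{if } A(i,j,1)=1, \quad i=1,\ldots,M,\ j=1,\ldots,N,$$

where $\lambda_{i,1}^{(j)}$ are the decision variables and all other variables are as defined above.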

Upon solving equation (16) to obtain the optimal control variables $\lambda_{i,1}^{(j)*}$, it is sought to
statistically characterize the tail distribution for the class 2 queue which will then be used
(recursively) to formulate and solve the optimization problem for the next class(es) under the
PPS ordering. Thus, for any class k equal to 2 or more, it is assumed that there are constants $c_{i,k}$
and $\theta_{i,k}$ such that

$$P[T_{i,k} > x] \approx c_{i,k}\,e^{-\theta_{i,k}x}, \quad i=1,\ldots,M,\ k=2,\ldots,K \tag{17}$$

Assuming that the optimization problems for classes 1,...,k-1 have been solved, the control
problem for class k can be formulated as:

$$\max \sum_{i=1}^{M} P_k^{+}\lambda_{i,k} - (P_k^{+}+P_k^{-})\,\lambda_{i,k}\,c_{i,k}\,e^{-\theta_{i,k}z_k} \tag{18}$$

s.t.

$$\lambda_{i,k} \le \frac{\ln(\alpha_k\zeta_k)}{z_k} + \Big(C_i - \sum_{k'=1}^{k-1}\rho_{i,k'}\Big)\mu_{i,k}, \quad i=1,\ldots,M;$$

$$\sum_{j=1}^{N}\lambda_{i,k}^{(j)} = \lambda_{i,k}, \quad i=1,\ldots,M;$$

$$\sum_{i=1}^{M}\lambda_{i,k}^{(j)} = \Lambda_k^{(j)}, \quad j=1,\ldots,N;$$

$$\lambda_{i,k}^{(j)} = 0 \ \text{if } A(i,j,k)=0, \quad i=1,\ldots,M,\ j=1,\ldots,N;$$

$$\lambda_{i,k}^{(j)} \ge 0 \ \text{if } A(i,j,k)=1, \quad i=1,\ldots,M,\ j=1,\ldots,N,$$

where $\lambda_{i,k}^{(j)}$ are the decision variables and all other variables are as defined above.

In order to apply the optimization algorithms described above, appropriate parameters for
$c_{i,k}$ and $\theta_{i,k}$ must be selected. In one embodiment, the parameters for $c_{i,k}$ and $\theta_{i,k}$ are selected based
on fitting the parameters with the first two moments of the response time distribution; however,
other methods of selecting parameters for $c_{i,k}$ and $\theta_{i,k}$ may be used without departing from the
spirit and scope of the present invention.

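It follows from equation (17) that for i=1,...,M, k=1,...,K:

$$E[T_{i,k}] = c_{i,k}/\theta_{i,k}, \qquad E[T_{i,k}^{2}] = 2c_{i,k}/\theta_{i,k}^{2} \tag{19}$$

so that

$$\theta_{i,k} = 2E[T_{i,k}]/E[T_{i,k}^{2}], \qquad c_{i,k} = 2\,(E[T_{i,k}])^{2}/E[T_{i,k}^{2}] \tag{20}$$

where $E[T_{i,k}]$ and $E[T_{i,k}^{2}]$ are the first and second moments of $T_{i,k}$, respectively. Using known
formulae for $E[T_{i,k}]$ and $E[T_{i,k}^{2}]$, the equations become:

$$E[T_{i,k}] = \frac{\sum_{k'=1}^{k}\lambda_{i,k'}\,b_{i,k'}^{(2)}}{2\,(1-\rho_{i,k-1}^{+})(1-\rho_{i,k}^{+})} + \frac{b_{i,k}}{1-\rho_{i,k-1}^{+}} \tag{21}$$

and

$$E[T_{i,k}^{2}] = \frac{\sum_{k'=1}^{k}\lambda_{i,k'}\,b_{i,k'}^{(3)}}{3\,(1-\rho_{i,k-1}^{+})^{2}(1-\rho_{i,k}^{+})} + \frac{b_{i,k}^{(2)}}{(1-\rho_{i,k-1}^{+})^{2}} + \left(\frac{\sum_{k'=1}^{k}\lambda_{i,k'}\,b_{i,k'}^{(2)}}{(1-\rho_{i,k-1}^{+})(1-\rho_{i,k}^{+})} + \frac{\sum_{k'=1}^{k-1}\lambda_{i,k'}\,b_{i,k'}^{(2)}}{(1-\rho_{i,k-1}^{+})^{2}}\right)E[T_{i,k}] \tag{22}$$

where $b_{i,k'}$, $b_{i,k'}^{(2)}$, $b_{i,k'}^{(3)}$ are the first three moments of the service times, and $\rho_{i,k}^{+}$ is the total load of
classes 1,...,k:

$$\rho_{i,k}^{+} = \sum_{k'=1}^{k}\rho_{i,k'} \equiv \sum_{k'=1}^{k}\lambda_{i,k'}\,b_{i,k'}/C_i \tag{23}$$

When the service requirements can be expressed as mixtures of exponential distributions,
the cost functions of equation (18) are concave. Therefore, the network flow model algorithms
can be recursively applied to classes 1,2,...,K.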
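The moment matching of equations (19) and (20) amounts to two divisions. A minimal sketch, assuming the first two response time moments have already been computed (for instance from equations (21) and (22)); the function name `fit_exponential_tail` is hypothetical.

```python
def fit_exponential_tail(mean, second_moment):
    """Fit P[T > x] ~ c * exp(-theta * x) to the first two moments of T.

    Follows equation (20): theta = 2 E[T] / E[T^2], c = 2 E[T]^2 / E[T^2].
    Returns the pair (c, theta).
    """
    if mean <= 0 or second_moment <= 0:
        raise ValueError("moments must be positive")
    theta = 2.0 * mean / second_moment
    c = 2.0 * mean ** 2 / second_moment
    return c, theta
```

For an exponentially distributed response time with rate 4 (mean 0.25, second moment 0.125) the fit recovers c = 1 and theta = 4, so the approximation is exact in that case.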


SLA Profit Maximization: General Workload Model 

The optimization approach described above can be used to handle the case of even more
general workload models in which the exponential exogenous arrival and service process
assumptions described above are relaxed. As in the previous cases, analytical expressions for the
tail distributions of the response times are derived. In the general workload model, the theory of
large deviations is used to compute the desired functions and their upper bounds.

Consider a queuing network composed of independent parallel queues, as shown in
Figures 6A and 6B. The workload model is set to stochastic processes $U_k^{(j)}(t)$ representing the
amount of class k work destined for site j that has arrived during the time interval (0,t). The
workload model defined in this manner corresponds to Web traffic streams at the request level
rather than at the session level, which was the case described above with regard to Figures 6A
and 6B.

Let $b_{i,k}$ be the decision variables representing the proportion of traffic of $U_k^{(j)}$ to be sent to
server i: $U_{i,k}^{(j)}(t) = b_{i,k}\,A(i,j,k)\,U_k^{(j)}(t)$. Let $V_{i,k}(t)$ be the potential traffic of class k sent to server i during
the time interval (0,t):

$$V_{i,k}(t) = \sum_{j=1}^{N} A(i,j,k)\,U_k^{(j)}(t) \tag{25}$$

and define

$$\rho_{i,k} = \lim_{t\to\infty} V_{i,k}(t)/t \tag{26}$$

to be the associated asymptotic potential load of class k at server i, provided the limit exists. Further
let

$$U_{i,k}(t) = \sum_{j=1}^{N} U_{i,k}^{(j)}(t) = \sum_{j=1}^{N} b_{i,k}\,A(i,j,k)\,U_k^{(j)}(t) = b_{i,k}V_{i,k}(t) \tag{27}$$

denote the class k traffic that has been sent to server i during the time interval (0,t). Thus,

$$\lim_{t\to\infty} (1/t)\,U_{i,k}(t) = b_{i,k}\,\rho_{i,k} \tag{28}$$

Assume again that GPS is in place for all servers with the capacity sharing represented by
the decision variables $\phi_{i,k}$. The SLA under consideration is

$$P[W_k > z_k] \le \alpha_k, \quad k=1,\ldots,K-1 \tag{29}$$

where $W_k$ is the remaining work of class k at any server.

Bounds of the tail distributions of the remaining class k work at server i are considered by
analyzing each of the queues in isolation with service capacity $\phi_{i,k}C_i$ (see Figure 6B). Such
bounds exist for both arbitrary and Markovian cases. For tractability of the problem, it is
assumed that $b_{i,k}\,\rho_{i,k} < \phi_{i,k}C_i$.

Only asymptotic tail distributions given by the theory of large deviations are considered:

$$P[W_{i,k} > z_k] \sim \exp(-\theta_{i,k}z_k) \tag{30}$$

where $W_{i,k}$ is the remaining work of class k at server i. In order to apply the large deviations
principle, it is assumed that for all i=1,...,M and k=1,...,K, the following assumptions hold:

(A1) the arrival process $V_{i,k}(t)$ is stationary and ergodic (see S. Karlin et al., A First
Course in Stochastic Processes, 2nd Ed., Academic Press, San Diego, CA, 1975, pages 443 and
487-488, which is hereby incorporated by reference); and

(A2) for all $0<\theta<\infty$, the limit $\Lambda_{i,k}(\theta)=\lim_{t\to\infty}(1/t)\log E\exp(\theta V_{i,k}(t))$
exists, and $\Lambda_{i,k}(\theta)$ is strictly convex and differentiable.

Note that for some arrival processes, assumption (A2) is valid only through a change of
scaling factor. In this case, the asymptotic expression of the tail distribution of the form in
equation (30) could still hold, but with a subexponential distribution instead of an exponential
one.

It then follows that under assumptions A1 and A2, the arrival processes $V_{i,k}(t)$ satisfy the
large deviations principle with the rate function

$$\Lambda_{i,k}^{*}(a) = \sup_{\theta}\,(\theta a - \Lambda_{i,k}(\theta)) \tag{31}$$

where $\Lambda_{i,k}^{*}(a)$ is the Legendre transform of $\Lambda_{i,k}(\theta)$.
Now let

$$\Lambda_{U_{i,k}}(\theta) = \lim_{t\to\infty}(1/t)\log E\,e^{\theta U_{i,k}(t)} = \lim_{t\to\infty}(1/t)\log E\,e^{\theta b_{i,k}V_{i,k}(t)}$$

Then, $\Lambda_{U_{i,k}}(\theta) = \Lambda_{i,k}(\theta b_{i,k})$, and thus, the exponential decay rate $\theta_{i,k}^{*}$ is defined by

$$\theta_{i,k}^{*} = \sup\{\theta : \Lambda_{i,k}(\theta b_{i,k})/\theta < \phi_{i,k}C_i\} \tag{32}$$

This exponential decay rate is a function of $C_{i,k} \equiv \phi_{i,k}C_i$ and $b_{i,k}$, which will be denoted by
$\theta_{i,k}^{*}(b_{i,k},C_{i,k})$. As $\theta_{i,k}^{*}(b_{i,k},C_{i,k})$ decreases and is differentiable in $b_{i,k}$, $x_{i,k}(C_{i,k},\theta)$ can be defined as the
inverse of $\theta_{i,k}^{*}$ with respect to $b_{i,k}$, i.e. $x_{i,k}(C_{i,k},\theta_{i,k}^{*}(b_{i,k},C_{i,k})) = b_{i,k}$. The corresponding
optimization problem may then be formulated as:

$$\max \sum_{i=1}^{M}\sum_{k=1}^{K-1} P_k^{+}\,b_{i,k}\rho_{i,k} - (P_k^{+}+P_k^{-})\,b_{i,k}\rho_{i,k}\exp(-\theta_{i,k}^{*}z_k) \tag{33}$$

s.t.

$$b_{i,k} \le x_{i,k}\big(C_{i,k},\,-\log(\alpha_k\zeta_k)/z_k\big), \quad i=1,\ldots,M,\ k=1,\ldots,K-1; \tag{34}$$

$$b_{i,k} < \phi_{i,k}C_i/\rho_{i,k}, \quad i=1,\ldots,M,\ k=1,\ldots,K-1;$$

$$\sum_{i=1}^{M} b_{i,k} = 1, \quad k=1,\ldots,K-1;$$

$$\sum_{k}\phi_{i,k} \le 1, \quad i=1,\ldots,M.$$

Note that the constraint in equation (34) comes from the relaxed SLA requirement
$\theta_{i,k}^{*} \ge -\log(\alpha_k\zeta_k)/z_k$.

Owing to the above, the function $b\exp(-z\,\theta^{*}(b,C))$ is convex in b, so that the cost
function is also concave in the decision variables $b_{i,k}$. Thus, the optimization algorithms
described above may be iteratively used to solve the control problem.

In the case of Markovian traffic models, such as Markov Additive Processes and
Markovian Arrival Processes (see D. Lucantoni et al., "A Single Server Queue with Server
Vacations and a Class of Non-renewal Arrival Processes," Advances in Applied Probability, vol. 22, 1990,
pages 676-705, which is hereby incorporated by reference), such functions can be expressed as
the Perron-Frobenius eigenvalue. Thus, efficient numerical and symbolic computational
schemes are available.
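As a numerical illustration of the decay rate in equation (32), the supremum can be located by bisection when the scaled log moment generating function is available in closed form. The sketch below makes several assumptions that are not in the text: the caller supplies `log_mgf` as a Python callable, the ratio `log_mgf(theta * beta) / theta` is increasing in theta (as it is for a convex function vanishing at zero), and `theta_max` lies inside the domain of `log_mgf`. The name `decay_rate` is hypothetical.

```python
def decay_rate(log_mgf, beta, capacity, theta_max, iters=200):
    """Approximate sup{theta in (0, theta_max): log_mgf(theta*beta)/theta < capacity}.

    log_mgf is the limiting scaled log moment generating function of the
    potential traffic; beta is the routed proportion; capacity is phi * C.
    """
    lo, hi = 0.0, theta_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if log_mgf(mid * beta) / mid < capacity:
            lo = mid  # condition still holds: the supremum is further right
        else:
            hi = mid
    return lo
```

For example, for a compound Poisson stream with arrival rate nu and exponentially distributed work of rate mu (an assumed traffic model chosen only because its log moment generating function, nu*theta/(mu - theta), is simple), the routine with beta = 1 reproduces the closed form mu - nu/capacity.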

Figure 8 is a flowchart outlining an exemplary operation of the present invention when
optimizing resource allocation of a Web server farm. As shown in Figure 8, the operation starts
with a request for optimizing resource allocation being received (step 810). In response to
receiving the request for optimization, parameters regarding the Web server farm are provided to
the models described above and an optimum solution is generated using these models (step 820).
Thereafter, the allocation of resources is modified to reflect the optimum solution generated (step
830). This may include performing incremental changes in resource allocation in an iterative
manner until measured resource allocation metrics meet the optimum solution or are within a
tolerance of the optimum solution.

Figure 9 provides a flowchart outlining an exemplary operation of a resource allocation
optimizer in accordance with the present invention. The particular resource allocation optimizer
shown in Figure 9 is for the GPS case. A similar operation for a PPS resource allocation
optimizer may be utilized with the present invention as discussed above.

As shown in Figure 9, the problem of finding the optimal arrival rates $\lambda_{i,K}^{(j)}$ for the
best effort class is solved (step 910). This is outlined in equation (6) above. Next, the GPS
parameters $\phi_{i,k}$ for each server i and class k<K are initialized (step 920). The values of the initial
parameters may be arbitrarily selected or may be based on empirical data indicating values
providing rapid convergence to an optimal solution.

The "previous" arrival rate and GPS parameters $\lambda_{i,k}^{(j)}$(old) and $\phi_{i,k}$(old) are initialized (step
930). The choice of initialization values forces the first test to determine convergence of $\lambda_{i,k}^{(j)}$ and
$\phi_{i,k}$ to fail. The class is initialized to k=1 (step 940) and the problem outlined in equation (8) is
solved to obtain the optimal values of the arrival rates $\lambda_{i,k}^{(j)}$ given the values of the GPS
parameters $\phi_{i,k}$ (step 950).

The class k is then incremented (step 960) and it is determined whether $k \le K-1$ (step 970).
If it is, the operation returns to step 950. Otherwise, the server is initialized to i=1 (step 980).
The problem outlined in equation (9) is solved to obtain the optimal values of the GPS
parameters $\phi_{i,k}$ given the values of the arrival rates $\lambda_{i,k}^{(j)}$ (step 990).

The server i is then incremented (step 1000) and a determination is made as to whether
$i \le M$ (step 1010). If it is, the operation returns to step 990. Otherwise, convergence of the $\lambda_{i,k}^{(j)}$
values is checked by comparing them with the $\lambda_{i,k}^{(j)}$(old) values (step 1020). If there is no
convergence, each old arrival rate value $\lambda_{i,k}^{(j)}$(old) is reset to be $\lambda_{i,k}^{(j)}$, and each old GPS
parameter $\phi_{i,k}$(old) is reset to be $\phi_{i,k}$ (step 1030). The operation then returns to step 940.

If there is convergence in step 1020, convergence of the $\phi_{i,k}$ values is checked by
comparing them with the $\phi_{i,k}$(old) values (step 1040). If there is no convergence the operation
goes to step 1030, described above. Otherwise the optimal arrival rates $\lambda_{i,k}^{(j)}$ and the optimal GPS
parameters $\phi_{i,k}$ for each server i, site j and class k have been identified and the operation
terminates.

Thus, the present invention provides a mechanism by which the optimum resource
allocation may be determined in order to maximize the profit generated by the computing system.
The present invention provides a mechanism for finding the optimal solution by modeling the
computing system based on the premise that revenue is generated when service level agreements
are met and a penalty is paid when service level agreements are not met. Thus, the present
invention performs optimum resource allocation using a revenue metric rather than performance
metrics.

It is important to note that while the present invention has been described in the context
of a fully functioning data processing system, those of ordinary skill in the art will appreciate that
the processes of the present invention are capable of being distributed in the form of a computer
readable medium of instructions and a variety of forms and that the present invention applies
equally regardless of the particular type of signal bearing media actually used to carry out the
distribution. Examples of computer readable media include recordable-type media, such as a
floppy disk, a hard disk drive, a RAM, CD-ROMs, DVD-ROMs, and transmission-type media,
such as digital and analog communications links, wired or wireless communications links using
transmission forms, such as, for example, radio frequency and light wave transmissions. The
computer readable media may take the form of coded formats that are decoded for actual use in a
particular data processing system.

The description of the present invention has been presented for purposes of illustration
and description, and is not intended to be exhaustive or limited to the invention in the form
disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art.
The embodiments were chosen and described in order to best explain the principles of the
invention, the practical application, and to enable others of ordinary skill in the art to understand
the invention for various embodiments with various modifications as are suited to the particular
use contemplated.